Collaborator

@guosran guosran commented Jan 18, 2026

Summary

Implements the OptimizeTaskGraph pass for the Taskflow dialect, enabling Hyperblock Fusion optimization to reduce task scheduling overhead and improve CGRA resource utilization.

Changes

New Files

| File | Description |
| --- | --- |
| HyperblockDependencyAnalysis.h | Dependency analysis class definition |
| HyperblockDependencyAnalysis.cpp | RAW/WAR/WAW dependency detection implementation |
| OptimizeTaskGraphPass.cpp | Core optimization pass logic |
| hyperblock-fusion.mlir | Basic fusion test |
| nested-fusion.mlir | Nested loop fusion test |
| fusion-with-outputs.mlir | Loops with different operations test |
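
For readers unfamiliar with the terminology, the RAW/WAR/WAW detection listed for HyperblockDependencyAnalysis.cpp can be pictured with the standalone sketch below. It is not the code in this PR: `AccessSet`, `DependencyKind`, and `classifyDependencies` are hypothetical names, and buffers are modeled as plain strings rather than MLIR values.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a hyperblock's memory footprint: the names of the
// buffers it reads and writes. The real analysis works on MLIR values.
struct AccessSet {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

// Bit flags, so one pair of hyperblocks can carry several dependency kinds.
enum DependencyKind : unsigned {
  kNone = 0,
  kRAW = 1u << 0, // The later block reads what the earlier block wrote.
  kWAR = 1u << 1, // The later block overwrites what the earlier block read.
  kWAW = 1u << 2, // Both blocks write the same buffer.
};

// Returns true when the two sets share at least one element.
static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const std::string &name : a)
    if (b.count(name))
      return true;
  return false;
}

// Classifies the dependencies from `earlier` to `later` in program order.
unsigned classifyDependencies(const AccessSet &earlier,
                              const AccessSet &later) {
  unsigned kinds = kNone;
  if (intersects(earlier.writes, later.reads))
    kinds |= kRAW;
  if (intersects(earlier.reads, later.writes))
    kinds |= kWAR;
  if (intersects(earlier.writes, later.writes))
    kinds |= kWAW;
  return kinds;
}
```

With this model, a block that writes `y` followed by a block that reads `y` is reported as `kRAW`, and two blocks that touch disjoint buffers are reported as `kNone`.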

Core Features

  1. Dependency Analysis (HyperblockDependencyGraph)

    • Detects RAW/WAR/WAW dependencies between hyperblocks
    • Counter compatibility checking (bounds/step matching)
    • Intermediate block dependency conflict detection
  2. Hyperblock Fusion (fuseHyperblocks)

    • Merges hyperblocks with identical counter structures
    • Supports fusion of non-adjacent hyperblocks by checking all (i, j) pairs
  3. Dead Hyperblock Elimination

    • Removes hyperblocks with no side effects and unused outputs
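
As a rough illustration of the dead hyperblock elimination rule above (no side effects and unused outputs), here is a minimal standalone sketch. The `Hyperblock` struct, the string-named values, and `eliminateDeadHyperblocks` are assumptions made for the example and do not reflect the pass's actual data structures.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: each hyperblock names the values it consumes and
// produces, plus a flag for side effects such as memory writes.
struct Hyperblock {
  std::string name;
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
  bool has_side_effects;
};

// Removes hyperblocks that have no side effects and whose outputs are used
// neither by a surviving hyperblock nor by the enclosing task's results.
// Repeats until nothing changes, since removing one block can orphan another.
std::vector<Hyperblock>
eliminateDeadHyperblocks(std::vector<Hyperblock> blocks,
                         const std::set<std::string> &task_results) {
  bool changed = true;
  while (changed) {
    changed = false;
    // Collect every value that is still needed by someone.
    std::set<std::string> live_values = task_results;
    for (const Hyperblock &block : blocks)
      live_values.insert(block.inputs.begin(), block.inputs.end());

    std::vector<Hyperblock> kept;
    for (const Hyperblock &block : blocks) {
      bool output_used = false;
      for (const std::string &value : block.outputs)
        if (live_values.count(value))
          output_used = true;
      if (block.has_side_effects || output_used)
        kept.push_back(block);
      else
        changed = true;
    }
    blocks = std::move(kept);
  }
  return blocks;
}
```

The fixed-point loop matters for chains: if block `b` is the only user of block `a`'s output and `b` is removed, `a` becomes dead on the next iteration.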

Future Work

  • Task Fusion (producer-consumer task merging)
  • Loop Peeling (handling mismatched loop bounds)
  • Architecture resource constraint integration

- Add HyperblockDependencyAnalysis for detecting RAW/WAR/WAW dependencies
- Implement OptimizeTaskGraphPass with hyperblock fusion and dead hyperblock elimination
- Handle SSA outputs by creating new hyperblock with combined result types
- Support non-adjacent hyperblock fusion by checking all (i,j) pairs
- Allow RAW dependencies since operation ordering is preserved
- Add hyperblock-fusion.mlir, nested-fusion.mlir, and fusion-with-outputs.mlir tests
- Fix relu_kernel.mlir deterministic checks for upstream compatibility
- Update CMakeLists.txt and TaskflowPasses registration
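
The pairwise search described in the commit message can be sketched as follows. This is one possible reading, not the PR's implementation: the types and function names are hypothetical, the fused body is assumed to sit at position i (so block j is hoisted over the blocks in between), and a direct WAR or WAW between the pair is assumed to block fusion while RAW is allowed.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a hyperblock: its loop counter and touched buffers.
struct Counter {
  long lower_bound;
  long upper_bound;
  long step;
  bool operator==(const Counter &other) const {
    return lower_bound == other.lower_bound &&
           upper_bound == other.upper_bound && step == other.step;
  }
};

struct Hyperblock {
  Counter counter;
  std::set<std::string> reads;
  std::set<std::string> writes;
};

static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const std::string &name : a)
    if (b.count(name))
      return true;
  return false;
}

// True when `first` and `second` have any RAW, WAR, or WAW dependency.
static bool conflicts(const Hyperblock &first, const Hyperblock &second) {
  return intersects(first.writes, second.reads) ||
         intersects(first.reads, second.writes) ||
         intersects(first.writes, second.writes);
}

// Checks whether blocks[j] may be merged into blocks[i], with i < j.
bool canFuse(const std::vector<Hyperblock> &blocks, std::size_t i,
             std::size_t j) {
  assert(i < j && j < blocks.size());
  // Counters must match in bounds and step.
  if (!(blocks[i].counter == blocks[j].counter))
    return false;
  // Assumption: blocks[j] moves up to position i, so it must not conflict
  // with any block strictly between i and j.
  for (std::size_t k = i + 1; k < j; ++k)
    if (conflicts(blocks[k], blocks[j]))
      return false;
  // Assumption: direct WAR/WAW between the pair blocks fusion. RAW is
  // allowed because the fused body keeps i's operations ahead of j's.
  if (intersects(blocks[i].reads, blocks[j].writes) ||
      intersects(blocks[i].writes, blocks[j].writes))
    return false;
  return true;
}

// Scans all (i, j) pairs with i < j and returns those that pass the check.
std::vector<std::pair<std::size_t, std::size_t>>
findFusionCandidates(const std::vector<Hyperblock> &blocks) {
  std::vector<std::pair<std::size_t, std::size_t>> candidates;
  for (std::size_t i = 0; i < blocks.size(); ++i)
    for (std::size_t j = i + 1; j < blocks.size(); ++j)
      if (canFuse(blocks, i, j))
        candidates.emplace_back(i, j);
  return candidates;
}
```

Checking every pair is quadratic in the number of hyperblocks per task, which is likely acceptable while tasks stay small; the intermediate-block check is what keeps non-adjacent fusion from reordering conflicting accesses.
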
Contributor

Copilot AI left a comment

Pull request overview

This PR implements a Hyperblock Fusion optimization pass for the Taskflow dialect to reduce task scheduling overhead and improve CGRA resource utilization. The implementation includes dependency analysis, hyperblock fusion within tasks, and dead hyperblock elimination.

Changes:

  • Adds hyperblock dependency analysis (RAW/WAR/WAW detection) to enable safe fusion decisions
  • Implements hyperblock fusion optimization that merges compatible hyperblocks with identical counter structures
  • Adds three test files demonstrating fusion in different scenarios (basic, nested, with different operations)
  • Updates test expectations in relu_kernel.mlir to reflect changes from the optimization pass

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| include/TaskflowDialect/Analysis/HyperblockDependencyAnalysis.h | Defines dependency analysis API for detecting memory dependencies between hyperblocks |
| lib/TaskflowDialect/Analysis/HyperblockDependencyAnalysis.cpp | Implements RAW/WAR/WAW dependency detection and fusion safety checks |
| lib/TaskflowDialect/Transforms/OptimizeTaskGraphPass.cpp | Core optimization pass implementing hyperblock fusion and dead code elimination |
| lib/TaskflowDialect/Analysis/CMakeLists.txt | Adds analysis library to build system |
| lib/TaskflowDialect/Transforms/CMakeLists.txt | Links optimization pass with analysis library |
| lib/TaskflowDialect/CMakeLists.txt | Adds Analysis subdirectory to build |
| include/TaskflowDialect/TaskflowPasses.td | Defines pass options and documentation |
| include/TaskflowDialect/TaskflowPasses.h | Adds pass creation function declaration |
| test/multi-cgra/taskflow/optimization/hyperblock-fusion.mlir | Tests basic hyperblock fusion scenario (currently tests non-fusion case) |
| test/multi-cgra/taskflow/optimization/nested-fusion.mlir | Tests fusion of nested loops within same task |
| test/multi-cgra/taskflow/optimization/fusion-with-outputs.mlir | Tests fusion with different operation types |
| test/e2e/relu/relu_kernel.mlir | Updates test expectations after optimization pass changes |

@guosran guosran marked this pull request as draft January 18, 2026 17:58
- Complete fuseHyperblocksInTask function comment
- Add null check for getDialect() in estimateHyperblockResources
- Fix posA/posB swap in HyperblockDependencyAnalysis::canFuse
- Fix enableTaskFusion default to false in TaskflowPasses.td
- Update hyperblock-fusion.mlir test description for accuracy
- Revert relu_kernel.mlir to main branch version
@guosran guosran force-pushed the feature/optimize-task-graph branch from 86d7467 to ea09163 Compare January 18, 2026 17:58
@guosran guosran marked this pull request as ready for review January 18, 2026 18:02
@guosran guosran requested a review from Copilot January 18, 2026 18:02
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.


Contributor

@tancheng tancheng left a comment

plz use snake_case variable naming

Why remove?

@ShangkunLi
Collaborator

Hi~ @guosran, I think a better starting point for taskflow.task fusion is our canonicalized taskflow dialect representation.

I will determine our affine controller design today and I will sync with you so that you can start.

@guosran guosran closed this Jan 19, 2026